Kang Li, Tianjin University, China, lanelim@foxmail.com PRIMARY
Fan-Rui Sun, Tianjin University, China, frsun@foxmail.com
Jie Li, Tianjin University, China, vassilee@tju.edu.cn SUPERVISOR
Kang Zhang, Tianjin University, China, kzhang@tju.edu.cn SUPERVISOR
Student Team: YES
Did you use data from both mini-challenges? NO
D3
MySQL
Approximately how many hours were spent working on this submission in total?
About 60 hours (60 days at 1 hour per day)
May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES
Video:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Approach Description
We use a multi-attribute ranking visualization to address MC2, as shown in Fig. 0.
(a) 1-attribute (b) 2-attribute (c) 3-attribute (d) 2-attribute (e) 5-attribute
Fig. 0. Multi-attribute ranking. (a-c) illustrate a quadtree metaphor. Each level represents an attribute and is divided into 4 groups containing the same number of IDs. The grayscale of a grid cell represents the number of IDs belonging to that cell. (d) Five important attributes. (e) An example containing all five attributes, which uses grayscale to represent the number of IDs. By observing the grayscale distribution, various communication patterns can be identified.
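The equal-size grouping described in the caption of Fig. 0 can be illustrated with a minimal sketch. It is not the code behind the figure: the function names, the input layout (one dictionary of ID to value per attribute), and the choice that group 1 holds the highest values are all assumptions made for illustration.

```python
from collections import Counter


def quartile_groups(values):
    """Assign each ID a group 1..4 by rank so that each group holds
    about the same number of IDs.

    Assumption: group 1 contains the IDs with the highest values.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    n = len(ranked)
    return {id_: (i * 4) // n + 1 for i, id_ in enumerate(ranked)}


def grid_counts(attributes):
    """Count how many IDs fall into each grid cell.

    `attributes` is a list of {id: value} dicts, one per attribute
    (hypothetical layout); all dicts are assumed to cover the same IDs.
    A cell is the tuple of group numbers, one per attribute.
    """
    groups = [quartile_groups(a) for a in attributes]
    ids = groups[0].keys()
    return Counter(tuple(g[i] for g in groups) for i in ids)


# Toy data, invented for illustration only.
sent = {1: 100, 2: 80, 3: 60, 4: 40, 5: 30, 6: 20, 7: 10, 8: 5}
received = {1: 90, 2: 70, 3: 5, 4: 3, 5: 50, 6: 40, 7: 2, 8: 1}
cells = grid_counts([sent, received])
```

The count stored for each cell is the quantity that the grayscale would encode, so a dark cell such as (1, 1) corresponds to many IDs that rank in the top quarter for both attributes.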
Questions
MC2.1 – Identify those IDs that stand out for their large volumes of communication. For each of these IDs
a. Characterize the communication patterns you see.
b. Based on these patterns, what do you hypothesize about these IDs?
a. Characterize the communication patterns you see.
For each day, we identify the five IDs with the largest combined number of sent and received messages.
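The per-day ranking can be sketched as follows. This is only an illustration under assumptions: the tuple layout `(day, sender, receiver)` and the function name `top_ids_per_day` are hypothetical, and the actual submission lists D3 and MySQL as its tools rather than Python.

```python
from collections import Counter, defaultdict


def top_ids_per_day(messages, k=5):
    """Return, for each day, the k IDs with the largest sum of
    sent and received messages.

    `messages` is an iterable of (day, sender, receiver) tuples
    (hypothetical layout).
    """
    totals = defaultdict(Counter)
    for day, sender, receiver in messages:
        totals[day][sender] += 1    # one sent message
        totals[day][receiver] += 1  # one received message
    return {day: counts.most_common(k) for day, counts in totals.items()}


# Toy data, invented for illustration only.
messages = [
    ("Fri", "A", "B"),
    ("Fri", "A", "C"),
    ("Fri", "B", "A"),
    ("Sat", "C", "A"),
]
top = top_ids_per_day(messages, k=2)
```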
Communication pattern 1:
Example: 839736
All five attributes of this type of ID are at level 1. These IDs tended to send group messages, and they sent and received roughly equal numbers of messages. They always replied to the messages they received and also received replies whenever they sent a message.
We believe that these IDs are site staff who maintain the various facilities on site.
Communication pattern 2:
Examples: (Fri: 1508923)(Sat: 1351786 and 1292409)(Sun: 620184, 19249 and 1952914)
This type of ID tended to send large volumes of messages, and only during the park’s opening hours.
We believe that these IDs are security staff, site directors, tour guides, or tour organizers.
Communication pattern 3:
Examples: (Fri: 1278894)(Sat: 1278894)(Sun: 1278894)
This type of ID showed unusual behavior. It sent large volumes of messages to a large number of people at regular intervals of 2 hours.
We believe that these IDs are machines that kept track of the number of visitors entering the park and the number of visitors not using any facilities.
Communication pattern 4:
Examples: (Fri: 825466 and 809736)(Sat: 2082743)
This type of ID tended to send group messages only while the park was open, and sent messages to far more people than it received messages from.
We believe that these IDs are visitors who liked to initiate communication.
(a) Friday (b) Saturday (c) Sunday
Fig.1-1 Top 5 IDs and their attributes, including the volumes of sending and receiving messages each day and the sum of the first two attributes.
(a) The top 5 IDs’ detailed information on Friday.
(b) The top 5 IDs’ detailed information on Saturday.
(c) The top 5 IDs’ detailed information on Sunday.
Fig.1-2 The top 5 IDs’ detailed information on three days.
b. Based on these patterns, what do you hypothesize about these IDs?
839736, 1278894 => machines
1508923, 1351786, 1292409 => staff
825466, 809736, 2082743, 620184, 19249, 1952914 => visitors
MC2.2 – Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime. Limit your response to no more than 10 images and 1000 words.
1. This type of ID is one of the major types of visitors, mostly engaged in one-to-one communication with occasional group messaging.
Fig.2-1 Communication pattern 1
2. This type of ID is one of the major types of visitors, frequently engaged in one-to-one communication.
Fig.2-2 Communication pattern 2
3. This type of ID frequently sent mass group messages.
We believe that they are tour organizers, tour guides, and park staff.
Fig.2-3 Communication pattern 3
4. This type of ID is one of the major types of visitors, engaged mostly in one-to-one communication with occasional group messaging. They sent far more messages than they received.
We believe that this type of visitor made the most use of the park facilities and liked to initiate communication.
Fig.2-4 Communication pattern 4
5. This type of ID always engaged in one-to-one communication, with low message volumes. They sent messages after receiving messages from the IDs that sent messages every two hours.
We believe that they are mostly single visitors or small groups of visitors.
Fig.2-5 Communication pattern 5
6. This type of ID primarily engaged in group messaging and received far more messages than it sent, with particularly high volumes during the opening hours.
We believe that these IDs are lottery machines, staff handling complaints, tour organizers, and cheerleaders.
Fig.2-6 Communication pattern 6
7. This type of ID never received any messages and sent only a few, always one-to-one.
We believe that these IDs are single visitors who may have communicated with the outside, or people with special purposes.
Fig.2-7 Communication pattern 7
8. This type of ID sent to and received from exactly the same set of IDs, and sent and received exactly one message per communication partner.
We believe that these IDs either had no desire to communicate, communicated only with staff, or had special purposes, since they minimized their communication to hide themselves.
Fig.2-8 Communication pattern 8
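The condition in pattern 8 is precise enough to be checked mechanically. The following is a minimal sketch of such a check, not the method used to produce Fig.2-8; the `(sender, receiver)` pair layout and the function name are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def symmetric_single_message_ids(messages):
    """Find IDs that send to and receive from exactly the same set of
    IDs, with exactly one message in each direction per partner.

    `messages` is an iterable of (sender, receiver) pairs
    (hypothetical layout).
    """
    sent = defaultdict(Counter)
    received = defaultdict(Counter)
    for sender, receiver in messages:
        sent[sender][receiver] += 1
        received[receiver][sender] += 1

    matches = []
    for id_, out in sent.items():
        inc = received.get(id_, Counter())
        same_partners = set(out) == set(inc)
        one_each = all(n == 1 for n in out.values()) and all(
            n == 1 for n in inc.values()
        )
        if same_partners and one_each:
            matches.append(id_)
    return sorted(matches)


# Toy data, invented for illustration only.
pairs = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("C", "A"), ("D", "A")]
found = symmetric_single_message_ids(pairs)
```

In the toy data only "B" qualifies: "A" receives from an extra partner, "C" sends two messages to the same partner, and "D" never receives anything.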
9. This type of ID sent messages to others in high volume, yet ignored some of the messages it received.
We believe that these IDs were either people with special purposes or visitors.
Fig.2-9 Communication pattern 9
MC2.3 – From this data, can you hypothesize when the crime was discovered? Describe your rationale.
Limit your response to no more than 3 images and 300 words.
(a) Calendar (b) Clock metaphor (axes: Hour, Minute)
Fig.3-1. Two types of calendar heatmap.
The periods on Sunday highlighted by the two yellow rectangles in Figure 3-1 (a) show an abnormal communication pattern, clearly distinct from the patterns at the same times on Friday and Saturday, with more than twice the communication volume of the other two days. We therefore suspect that the incident was discovered at around 11:36 on Sunday (indicated by the short black arrow on the clock metaphor in Figure 3-1 (b)). Starting at 12 o’clock on Sunday (indicated by the long red arrow), a large volume of communication occurred continuously for about 30 minutes, as staff ID 839736 spread the news via broadcast channels. These messages were meant to inform all the park staff of the event and to explain it to all the visitors.
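The comparison behind this reasoning (a time bin on Sunday carrying more than twice the volume seen at the same time on the other two days) can be sketched as below. The sketch is illustrative only: the input layout, the function name, and the use of the larger of the other days as the baseline are assumptions, and the counts are toy values, not figures from the challenge data.

```python
def abnormal_bins(counts_by_day, target_day, factor=2.0):
    """Return the time bins in which `target_day` carries more than
    `factor` times the largest volume seen in that bin on any other day.

    `counts_by_day` maps day -> {time bin -> message count}
    (hypothetical layout; bins are zero-padded "HH:MM" strings).
    """
    target = counts_by_day[target_day]
    others = [c for day, c in counts_by_day.items() if day != target_day]
    flagged = []
    for time_bin, count in target.items():
        baseline = max((o.get(time_bin, 0) for o in others), default=0)
        if count > factor * baseline:
            flagged.append(time_bin)
    return sorted(flagged)


# Toy counts, invented for illustration only.
counts = {
    "Fri": {"11:30": 100, "12:00": 120},
    "Sat": {"11:30": 110, "12:00": 130},
    "Sun": {"11:30": 250, "12:00": 200},
}
suspicious = abnormal_bins(counts, "Sun")
```

Taking the maximum over the other days is a conservative choice: a bin is flagged only if it exceeds the threshold relative to both Friday and Saturday.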